Computation of compression parameters via OpenVINO models #2727
Merged: alexsu52 merged 109 commits into openvinotoolkit:develop from nikita-savelyevv:compress-via-openvino on Jan 23, 2025
+2,195
−293
github-actions bot added the NNCF Common, NNCF OpenVINO and NNCF PTQ labels Jun 11, 2024
alexsu52 reviewed Jun 13, 2024
nncf/quantization/algorithms/weight_compression/weight_lowering.py
nikita-savelyevv force-pushed the compress-via-openvino branch 4 times, most recently from 55cafaa to a68a63d, July 3, 2024 18:31
nikita-savelyevv force-pushed the compress-via-openvino branch 4 times, most recently from 6b98ddd to 3d9faa4, July 16, 2024 14:19
nikita-savelyevv force-pushed the compress-via-openvino branch 6 times, most recently from 1c85732 to b527cac, September 6, 2024 11:11
github-actions bot added the documentation label Sep 6, 2024
nikita-savelyevv force-pushed the compress-via-openvino branch 2 times, most recently from ac3ea02 to 2a3a63c, September 11, 2024 12:59
nikita-savelyevv force-pushed the compress-via-openvino branch from c9569bb to a151d99, October 11, 2024 11:51
nikita-savelyevv force-pushed the compress-via-openvino branch 2 times, most recently from fe30c13 to 19ea412, October 21, 2024 08:52
alexsu52 reviewed Oct 22, 2024
nncf/quantization/algorithms/weight_compression/weight_lowering/dispatcher.py
nikita-savelyevv force-pushed the compress-via-openvino branch 3 times, most recently from eef34f8 to ca3447c, October 26, 2024 13:40
nikita-savelyevv force-pushed the compress-via-openvino branch from ca3447c to f3891cd, October 29, 2024 15:19
nncf/quantization/algorithms/weight_compression/weight_lowering.py
alexsu52 reviewed Jan 17, 2025
alexsu52 reviewed Jan 21, 2025
nncf/quantization/algorithms/weight_compression/openvino_backend.py
alexsu52 reviewed Jan 21, 2025
AlexanderDokuchaev requested changes Jan 21, 2025
This reverts commit 629705c46fbbdff81b8c3d0fed2299dbc5576603.
alexsu52 reviewed Jan 23, 2025
LGTM
- Please address minor comments from the last round of review.
- The comment Computation of compression parameters via OpenVINO models #2727 (comment) will be addressed in the follow-up PR.
nncf/quantization/algorithms/weight_compression/weight_lowering.py
nncf/quantization/algorithms/weight_compression/openvino_modeling.py
AlexanderDokuchaev approved these changes Jan 23, 2025
alexsu52 approved these changes Jan 23, 2025
alexsu52 pushed a commit that referenced this pull request Jan 24, 2025

### Changes

Follow up to #2727

1. Do not use `infer_request.results`
2. Replace `>=` with `opset.greater_equal()`
3. Rename `ov_numeric.py` to `openvino_numeric.py`

### Reason for changes

1. Improve int4 compression time by up to ~10%
2. Avoid warning: `DeprecationWarning: greater_equal is deprecated and will be removed in version 2025.3. Use ops.greater_equal instead`
3. Fix onnx install test

### Related tickets

139047

### Tests

- https://github.com/openvinotoolkit/nncf/actions/runs/12947249537
- NNCF/job/manual/job/post_training_weight_compression/301/
- NNCF/job/nightly/job/test_examples/653/
Labels
documentation
Improvements or additions to documentation
NNCF Common
Pull request that updates NNCF Common
NNCF OpenVINO
Pull requests that updates NNCF OpenVINO
NNCF PT
Pull requests that updates NNCF PyTorch
NNCF PTQ
Pull requests that updates NNCF PTQ
NNCF TF
Pull requests that updates NNCF TensorFlow
Changes
- `weight_lowering.py`:
  - `do_int_quantization()` is used for computing a compressed weight. Possible signatures:
    - `weight` -> `compressed_weight`, `scale`, (`zero_point` for asymmetric compression)
    - `weight`, `scale`, (`zero_point`) -> `compressed_weight`, `scale`, (`zero_point`)
  - `calculate_quantized_dequantized_weight()` is used for computing a decompressed weight. Possible signatures:
    - `weight` -> `decompressed_weight`
    - `weight`, `scale`, (`zero_point`) -> `decompressed_weight`
    - `weight` -> `decompressed_weight`, `compressed_weight`, `scale`, (`zero_point`)
    - `weight`, `scale`, (`zero_point`) -> `decompressed_weight`, `compressed_weight`, `scale`, (`zero_point`)
  - Output `scale` and `zero_point` are the same as the ones given as input (if they were given at all).
- NNCF Tensor backend for `openvino.Tensor`. The implementation of this backend is limited to the required functionality only, e.g. addition of OV Tensors is not supported because it is not needed.
  - Handling of `bf16`, `u4` and `i4` data types. For example, `bf16` constants are read from an OpenVINO LLM and given as inputs to a compressing OpenVINO model. `u4` and `i4` compressed weights are seamlessly inserted into the resulting compressed OpenVINO model.
  - `as_numpy_tensor()` method to convert an NNCF Tensor to the numpy backend. Currently only OV -> NP conversion is required.

Data-free asymmetric compression:
Data-free symmetric compression:
Data-aware compression:
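
As a rough illustration of the signature families listed for `do_int_quantization()` and `calculate_quantized_dequantized_weight()`, the sketch below shows one way the "with or without precomputed `scale`/`zero_point`" behaviour could look for the asymmetric case. It is plain Python over a single flat group of weights and is not the NNCF implementation: the function names `quantize_asym` and `quantize_dequantize_asym`, the `return_compressed` flag, and the min/max-based parameter estimation are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def quantize_asym(
    weight: List[float],
    num_bits: int = 4,
    scale: Optional[float] = None,
    zero_point: Optional[int] = None,
) -> Tuple[List[int], float, int]:
    """Hypothetical stand-in for the asymmetric compression path (not NNCF code)."""
    level_high = 2**num_bits - 1  # 15 for an unsigned 4-bit range
    if scale is None or zero_point is None:
        # Signature 1: parameters are estimated from the weight itself.
        w_min, w_max = min(weight), max(weight)
        scale = (w_max - w_min) / level_high or 1.0  # fall back to 1.0 for a constant weight
        zero_point = min(max(round(-w_min / scale), 0), level_high)
    # Signature 2: given parameters are used as-is and returned unchanged.
    compressed = [
        min(max(round(w / scale) + zero_point, 0), level_high) for w in weight
    ]
    return compressed, scale, zero_point


def quantize_dequantize_asym(
    weight: List[float],
    num_bits: int = 4,
    scale: Optional[float] = None,
    zero_point: Optional[int] = None,
    return_compressed: bool = False,  # hypothetical flag name
):
    """Quantize, then map back to floats; optionally also return the intermediates."""
    compressed, scale, zero_point = quantize_asym(weight, num_bits, scale, zero_point)
    decompressed = [(q - zero_point) * scale for q in compressed]
    if return_compressed:
        return decompressed, compressed, scale, zero_point
    return decompressed


if __name__ == "__main__":
    w = [-1.0, 0.0, 0.5, 2.0]
    q, s, zp = quantize_asym(w)  # weight -> compressed, scale, zero_point
    q_again, s_again, zp_again = quantize_asym(w, scale=s, zero_point=zp)
    print(q, s, zp, q_again == q)
    print(quantize_dequantize_asym(w, return_compressed=True))
```

In this sketch, passing `scale` and `zero_point` skips the estimation step and returns them unchanged, mirroring the note that output parameters equal the input ones when they are supplied. The symmetric case would drop `zero_point` and use a signed range.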
Reason for changes
Reducing model compression time. Only OpenVINO model compression backend is affected.
Related tickets
139047
Tests
- `tests/openvino/native/quantization/test_ov_modeling_compression.py::test_quantization_alignment` -- checks alignment with the reference numpy implementation
- `tests/openvino/native/test_openvino_modeling.py` -- checks OV modeling framework hyperparameters
- `tests/openvino/native/test_tensor.py` -- NNCF OV Tensor backend tests

Validation jobs:
- NNCF/job/manual/job/post_training_weight_compression/299/
- NNCF/job/nightly/job/test_examples/650